Modeling the prosody of Vietnamese attitudes for expressive speech synthesis
Abstract
Attitudes, or social affects, are strongly involved in the processing of interaction and are tied specifically to the socio-cultural aspects of a language. This paper presents a model of attitudes for expressive speech synthesis in Vietnamese, an under-resourced tonal language. A prosodic model for Vietnamese attitudes is proposed, based on the concept of a “rendez-vous” between the linguistic levels and the prosodic functions of an utterance. The model is applied to generate the prosody of attitudes in Vietnamese. A perceptual experiment on utterances synthesized with this model shows that the attitudes are evaluated well.
Similar resources
Hierarchical stress generation with Fujisaki model in expressive speech synthesis
This paper introduces a hierarchical stress generation method for expressive speech synthesis. In a previous study, we proposed a novel hierarchical Mandarin stress modeling method, and text-based stress prediction experiments demonstrated that a reliable stress assignment can be obtained from textual features. However, the stress model should be further verified to be an effective and efficient pros...
Modeling of Fundamental Frequency Contour of Thai Expressive Speech using Fujisaki’s Model and Structural Model
Problem statement: In spontaneous speech communication, prosody is an important factor that must be taken into account, since it affects not only the naturalness but also the intelligibility of speech. A number of systems for the synthesis of Thai expressive speech have been developed over the years. However, the expressive speech with various speaking styles has not been accomplishe...
Modeling Prosody Pattern of Chinese Expressive Speech and Its Application in Personalized Speech Conversion
This paper proposes an approach for modeling the prosody patterns of acoustic features of Chinese expressive speech. In a Chinese multi-syllabic prosodic word, one syllable is identified as the core syllable, based on the observation that speakers usually put more emphasis on that syllable. The variations of the acoustic features migrating from neutral to expressive speech are then analyzed for both t...
Comparison of chironomic stylization versus statistical modeling of prosody for expressive speech synthesis
Chironomic stylization is the process of real-time modification of intonation contours (f0 and tempo) using drawing/writing gestures with a stylus on a graphic tablet. The question addressed in this research is whether hand-made intonation stylization could improve or degrade expressivity and overall quality, compared to statistical modeling of prosody. A system for expressive TTS in French bas...
Study on Unit-Selection and Statistical Parametric Speech Synthesis Techniques
One interesting topic in the multimedia domain is enabling computers to produce speech. Speech synthesis gives the computer the human ability to produce speech. The data-based approach and the process-based approach are the two main approaches to speech synthesis. Each approach has its own challenges. Unit-selection speech synthesis and statistical parametr...